Unsupervised Hypernym Detection by Distributional Inclusion Vector Embedding

Authors

  • Haw-Shiuan Chang
  • ZiYun Wang
  • Luke Vilnis
  • Andrew McCallum
Abstract

Modeling hypernymy, such as poodle is-a dog, is an important generalization aid to many NLP tasks, such as entailment, relation extraction, and question answering. Supervised learning from labeled hypernym sources, such as WordNet, limits the coverage of these models, which can be addressed by learning hypernyms from unlabeled text. Existing unsupervised methods either do not scale to large vocabularies or yield unacceptably poor accuracy. This paper introduces distributional inclusion vector embedding (DIVE), a simple-to-implement unsupervised method of hypernym discovery via per-word non-negative vector embeddings learned by modeling diversity of word context with specialized negative sampling. In an experimental evaluation more comprehensive than any previous literature of which we are aware, covering 11 datasets and using multiple existing as well as newly proposed scoring metrics, we find that our method can provide up to double or triple the precision of previous unsupervised methods, and also sometimes outperforms previous semi-supervised methods, yielding many new state-of-the-art results.
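The distributional inclusion idea behind DIVE is that, in a non-negative embedding space, a hypernym's vector should roughly cover its hyponym's vector coordinate by coordinate. The sketch below illustrates one simple asymmetric score of that kind over assumed toy embeddings; it is only an illustration of the inclusion intuition, not the paper's actual DIVE objective or scoring function.

```python
# Hedged sketch: an asymmetric, inclusion-style hypernymy score over
# non-negative word vectors. This is NOT the exact DIVE training
# objective or the paper's scoring function; it only illustrates the
# idea that a hypernym's representation should roughly "contain" its
# hyponym's representation, dimension by dimension.
import numpy as np

def inclusion_score(hypo_vec: np.ndarray, hyper_vec: np.ndarray) -> float:
    """Fraction of the hyponym's mass covered by the hypernym.

    Both inputs are assumed to be non-negative embeddings (or sparse
    context-count vectors). Returns a value in [0, 1]; higher values
    suggest hyper_vec is a plausible hypernym of hypo_vec.
    """
    covered = np.minimum(hypo_vec, hyper_vec).sum()
    total = hypo_vec.sum()
    return float(covered / total) if total > 0 else 0.0

# Toy example with hypothetical 5-dimensional non-negative embeddings.
poodle = np.array([2.0, 0.5, 0.0, 1.0, 0.0])
dog    = np.array([3.0, 1.0, 0.5, 2.0, 0.2])
print(inclusion_score(poodle, dog))  # 1.0: "dog" fully covers "poodle"
print(inclusion_score(dog, poodle))  # ~0.52: much lower in the reverse direction
```

The asymmetry of the score is what gives it a direction: it can distinguish "poodle is-a dog" from "dog is-a poodle", which a symmetric similarity measure cannot.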

Similar resources

AHyDA: Automatic Hypernym Detection with Feature Augmentation

English. Several unsupervised methods for hypernym detection have been investigated in distributional semantics. Here we present a new approach based on a smoothed version of the distributional inclusion hypothesis. The new method improves hypernym detection when tested on the BLESS dataset. Italian. Building on the unsupervised methods presented in the literature, we address the ...

Distributional Lexical Entailment by Topic Coherence

Automatic detection of lexical entailment, or hypernym detection, is an important NLP task. Recent hypernym detection measures have been based on the Distributional Inclusion Hypothesis (DIH). This paper assumes that the DIH sometimes fails, and investigates other ways of quantifying the relationship between the cooccurrence contexts of two terms. We consider the top features in a context vecto...

Chasing Hypernyms in Vector Spaces with Entropy

In this paper, we introduce SLQS, a new entropy-based measure for the unsupervised identification of hypernymy and its directionality in Distributional Semantic Models (DSMs). SLQS is assessed through two tasks: (i.) identifying the hypernym in hyponym-hypernym pairs, and (ii.) discriminating hypernymy among various semantic relations. In both tasks, SLQS outperforms other state-of-the-art meas...
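As a rough illustration of the entropy-based intuition behind SLQS (more general words tend to appear in less informative, higher-entropy contexts), here is a minimal sketch. The toy co-occurrence matrix, ranking contexts by raw counts, and the top_n value are all assumptions made for brevity; the original measure selects each word's top contexts by an association score and is defined in the SLQS paper, not reproduced exactly here.

```python
# Hedged sketch of an SLQS-style directional generality measure.
import numpy as np

def context_entropy(cooc: np.ndarray, context_id: int) -> float:
    """Shannon entropy of the word distribution in one context column."""
    column = cooc[:, context_id].astype(float)
    p = column / column.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def word_generality(cooc: np.ndarray, word_id: int, top_n: int = 2) -> float:
    """Median entropy of the word's top-N contexts (ranked here by raw count)."""
    top_contexts = np.argsort(cooc[word_id])[::-1][:top_n]
    return float(np.median([context_entropy(cooc, c) for c in top_contexts]))

def slqs(cooc: np.ndarray, hypo_id: int, hyper_id: int) -> float:
    """SLQS-style score: positive values suggest hyper_id is more general."""
    return 1.0 - word_generality(cooc, hypo_id) / word_generality(cooc, hyper_id)

# Toy word-by-context co-occurrence counts (rows: words, columns: contexts).
cooc = np.array([
    [1, 2, 9, 3],   # word 0, e.g. "poodle": mass on narrow contexts
    [8, 7, 1, 6],   # word 1, e.g. "dog": mass on broad, shared contexts
    [6, 7, 0, 1],   # word 2, e.g. "cat"
    [5, 1, 0, 0],   # word 3, e.g. "fish"
])
print(slqs(cooc, 0, 1))  # > 0: "dog" looks more general than "poodle"
```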

Distributional Hypernym Generation by Jointly Learning Clusters and Projections

We propose a novel word embedding-based hypernym generation model that jointly learns clusters of hyponym-hypernym relations, i.e., hypernymy, and projections from hyponym to hypernym embeddings. Most of the recent hypernym detection models focus on a hypernymy classification problem that determines whether a pair of words is in hypernymy or not. These models do not directly deal with a hyperny...
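To make the projection part concrete, the sketch below fits a single linear map from hyponym embeddings to hypernym embeddings by ordinary least squares on synthetic data. The cited model additionally learns clusters of hypernymy relations with one projection per cluster, which is omitted here; all names, dimensions, and data below are hypothetical.

```python
# Hedged sketch: one linear hyponym-to-hypernym projection via least
# squares, NOT the joint clustering-and-projection model of the paper.
import numpy as np

rng = np.random.default_rng(0)
dim = 50
n_pairs = 200

# Hypothetical training data: embeddings for (hyponym, hypernym) pairs.
hypo = rng.normal(size=(n_pairs, dim))
true_W = rng.normal(size=(dim, dim)) * 0.1
hyper = hypo @ true_W + rng.normal(scale=0.01, size=(n_pairs, dim))

# Solve min_W ||hypo @ W - hyper||^2 in closed form.
W, *_ = np.linalg.lstsq(hypo, hyper, rcond=None)

# To generate hypernym candidates for a new word, project its embedding
# and search for nearest neighbours among the vocabulary embeddings.
query = rng.normal(size=(1, dim))
projected = query @ W
print(projected.shape)  # (1, 50)
```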

Learning Concept Hierarchies from Text with a Guided Hierarchical Clustering Algorithm

We present an approach for the automatic induction of concept hierarchies from text collections. We propose a novel guided agglomerative hierarchical clustering algorithm exploiting a hypernym oracle to drive the clustering process. By inherently integrating the hypernym oracle into the clustering algorithm, we overcome two main problems of unsupervised clustering approaches relying on the dist...


Journal:
  • CoRR

Volume: abs/1710.00880

Publication date: 2017